Bioinformatics-based identification and validation of hub genes associated with aging in patients with coronary artery disease

Coronary artery disease (CAD) is the most common aging-related disease in adults. We used bioinformatics analysis to study genes associated with aging in patients with CAD. The microarray data of the GSE12288 dataset were downloaded from the Gene Expression Omnibus database to obtain 934 CAD-associated differentially expressed genes. By overlaying them with aging-related genes in the Aging Atlas database, 33 differentially expressed aging-related genes (DEARGs) were identified. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses revealed that the 33 DEARGs were mainly enriched in cell adhesion and activation, Th17 and Th1/Th2 cell differentiation, and longevity regulation pathways. Hub genes were further screened using multiple algorithms of Cytoscape software and validation set GSE71226. Clinical samples were then collected, and the expression of hub genes in whole blood was detected by real-time quantitative polymerase chain reaction, enzyme-linked immunosorbent assay, and western blot at the transcription and translation levels. Finally, HSP90AA1 and CEBPA were identified as hub genes. The results of this study suggest that HSP90AA1 and CEBPA are closely related to CAD. These findings provide a theoretical basis for the association between aging effectors and CAD, and indicate that these genes may be promising biomarkers for the diagnosis and treatment of CAD.


AGING
neurodegenerative diseases, cardiovascular and cerebrovascular diseases, and tumors [6,7].China's aging population is growing; the number of senior citizens is expected to reach 480 million by 2050, accounting for approximately 25% of Asia's elderly population by that time [8].The incidence of several age-related diseases, including cardiovascular diseases [9], is increasing on an annual basis.This trend is making life more challenging for older individuals and placing a significant economic burden on society.Thus, there is an urgent need to perform research on the aging process, explore the molecular mechanism of the occurrence and development of aging-related diseases, and identify therapeutic targets for agingrelated diseases.The correlation between CAD and aging-related genes (ARGs) is unclear, and which ARGs are essential for the development of CAD is unknown.Further research is needed to identify therapeutic biomarkers for CAD based on potential ARGs.
In this study, we examined the relationship between ARGs and CAD using bioinformatics methods for the first time.We screened differentially expressed ARGs (DEARGs) based on the Gene Expression Omnibus (GEO) and Aging Atlas database.Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, and protein-protein interaction (PPI) analysis were performed on the DEARGs.In addition, we used multiple algorithms from Cytoscape software to screen hub genes and validate them with a separate dataset.Blood samples were collected, and the mRNA and protein expression levels of hub genes in whole blood from patients in the CAD group and control group were detected by real-time quantitative polymerase chain reaction (RT-qPCR), enzymelinked immunosorbent assay (ELISA), and western blot.Finally, the hub genes of DEARGs in CAD were identified.A flow chart of this study is shown in Figure 1.

Functional enrichment analysis of DEARGs
The functional enrichment analysis chart of the DEARGs shows the four most significantly enriched GO terms (Figure 3A, 3B).The enriched biological processes include positive regulation of cell adhesion, positive regulation of cell activation, positive regulation of leukocyte activation, and positive regulation of cell-cell adhesion.The enriched cellular components include RNA polymerase II transcription regulator complex, immunological synapses, protein complexes involved in cell adhesion, and serine-type peptidase complex.The enriched molecular functions are protein serine/threonine/tyrosine kinase activity, ubiquitin protein ligase binding, phosphoprotein binding, and protein phosphorylated amino acid binding.The KEGG enrichment analysis revealed that DEARGs play key roles in Th17 cell differentiation; human T-cell leukemia virus 1 infection; growth hormone synthesis, secretion, and action; Th1 and Th2 cell differentiation; and longevityregulating pathways (Figure 3C, 3D).

Analysis of PPI networks and identification of hub genes
PPI networks were constructed using the STRING database to identify interactions between DEARGs and further visualize the results comprising 33 nodes and 86 edges.Figure 4A shows the PPI network of the DEARGs.The circles represent genes, the lines represent PPIs between genes, and the results in the circles represent the protein structure.The colors of the lines represent evidence of PPIs.We used various algorithms from the Cytoscape software Cytohubba plugin, including Degree, MCC, MNC, EPC, DMNC, EcCentricity, and Closeness, to identify hub genes.We then created an UpSet map to visualize the results (Figure 4B).Finally, six hub genes were obtained: CEBPA, HIF1A, HSP90AA1, IL2, FOXO1, and STAT5A (Table 1).

Validation of differentially expressed hub genes in GSE datasets
The differences in the expression of the six hub genes in GSE12288 and GSE71226 are shown in Table 2.Among them, HSP90AA1, CEBPA, and FOXO1 were differentially expressed in the GSE71226 dataset, consistent with GSE12288, whereas HIF1A, IL2, and STAT5A were not differentially expressed.Figure 5A, 5C shows the differences in the expression levels of HSP90AA1, CEBPA, and FOXO1 between the CAD and control groups in the GSE12288 and GES71226 datasets.Receiver operating characteristic (ROC) curves were used to detect the diagnostic value of three hub genes for CAD in two datasets (Figure 5B, 5D).HSP90AA1 (area under the curve (AUC), 0.889), CEBPA (AUC, 1.000), and FOXO1 (AUC, 1.000) were detected in GSE71226 with high accuracy.

Validation of hub genes at the transcriptional level
The mRNA expression levels of HSP90AA1, CEBPA, and FOXO1 in whole blood of the CAD group and control group were detected in 100 clinical samples.Figure 6A shows that the mRNA expression level of HSP90AA1 was lower (P < 0.05) and that the mRNA expression level of CEBPA (P < 0.01) was higher in the CAD group than in the control group.These results are consistent with the GSE12288 and GSE71226 datasets.There was no significant difference in FOXO1 expression between the two groups (P = 0.54).

Validation of hub genes at the translational level
The ELISA results showed that the plasma levels of HSP90AA1, CEBPA, and FOXO1 were consistent with mRNA expression (Figure 6B).The levels of HSP90AA1 in plasma samples of the control group and CAD group were 214.18 ± 135.33 and 154.08 ± 57.13 ng/L, respectively (P < 0.01).The plasma CEBPA concentration was significantly higher in the CAD group than in the control group (127.78 ± 134.37 vs. 55.56 ± 85.59 ng/L, respectively; P < 0.05).The expression level of FOXO1 was lower in the CAD    6C), while the level of CEBPA protein was significantly higher in the CAD group than in the control group (Figure 6D).Possibly due to the low expression of FOXO1 in PBMCs, we were unable to visualize FOXO1 with western blot and therefore could not evaluate it.

HSP90AA1 and CEBPA were involved in the most significantly enriched GO and KEGG pathways
In the most enriched GO terms, CEBPA participated in positive regulation of leukocyte activation, positive regulation of cell activation, and RNA polymerase II transcription regulator complex, whereas HSP90AA1 participated in ubiquitin protein ligase binding.HSP90AA1 was also involved in the most enriched KEGG pathway, namely Th17 cell differentiation (Table 3).

DISCUSSION
With the gradual aging of the worldwide population, it is becoming more critical to study and formulate healthy aging strategies.Aging is a multifactorial progressive process influenced by genetic and epigenetic regulation, post-translational regulation, host-microbiome interactions, metabolic regulation, lifestyle, and many other factors [10][11][12][13][14].The Aging Atlas database used in this study focuses on big data generated by omics techniques, providing a wider range of valuable resources for the aging research community and other life scientists [15].The major risk factor for cardiovascular disease is aging [16].The study of ARGs associated with CAD has important clinical significance and social value.
In this study, DEARGs were identified for the first time by bioinformatics analysis, and the role of ARGs in the pathogenesis of CAD was discussed in the context of such analysis.We screened 33 DEARGs of CAD from the GEO dataset and Aging Atlas database.The GO   HSP90AA1 is associated with the most significantly enriched GO term (ubiquitin protein ligase binding) and the most enriched KEGG pathway (Th17 cell differentiation).The significantly enriched GO items in which CEBPA participates are positive regulation of leukocyte activation, positive regulation of cell activation, and RNA polymerase II transcription regulator complex.
Heat shock proteins (HSPs) are a class of highly conserved stress proteins that have molecular chaperone activity and are involved in various aspects of protein biogenesis, including folding, oligomer assembly, transport to specific subcellular compartments, and controlled switching between active/inactive conformations [17].Furthermore, HSPs are tightly controlled by cellular regulatory mechanisms that protect cells and tissues from the misfolding of denatured proteins by regulating transcription and translation [18].HSP90AA1 is the most widely studied member of the HSP family, and its primary role is to maintain protein homeostasis and protect cells.Animal experiments have confirmed that HSP90 (AA1) plays an important role in myocardial ischemia/reperfusion (I/R) injury and cardiac protection [19][20][21][22][23]. HSP90 (AA1)-mediated anti-apoptosis plays a crucial role in preconditioning cardiac protection [20,21].Liraglutide preconditioning elevates HSP90AA1 levels, inhibits the inflammatory response and C5a/NF-κB signaling, alleviates I/R-induced cardiocyte apoptosis, and protects the heart [24].According to a study by Zhu et al. [25], knockdown of HSP90AA1 can enhance the cardiomyocyte apoptosis induced by oxygen-glucose deprivation, and inhibition of miR-1 can lead to increases in HSP90AA1 and Bcl-2; these processes are conducive to protection against myocardial I/R injury.HSP90AA1 also plays a relevant role in longevity.HSP90AA1mediated regulation of a mammalian transcription factor EB ortholog was found to be involved in the extended lifespan of Caenorhabditis elegans in the absence of its food source bacteria [26].
CEBPA is a myeloid transcription factor, and mutations in this protein play a crucial role in the pathogenesis of hematologic tumors [27,28].However, such mutations have not been thoroughly studied in cardiovascular disease and in promoting atherosclerosis.One study showed that CEBPA mediates epicardial activation during heart development and injury and that disrupttion of CEBPA signaling in the epicardial tissue in adults reduces injury-induced neutrophil infiltration and improves cardiac function [29].Together with PPARγ, CEBPA regulates the adipogenesis process and is involved in the sequence expression of adipocytespecific proteins [30][31][32][33][34][35][36].CEBPA is upregulated in unstable plaques, and overexpression of CEBPA may contribute to the occurrence and development of atherosclerosis [37,38].However, Bristol et al. [39] observed that CEBPA and CEBPB bind the TNFR1 promoter, increase its expression through a positive feedback mechanism, and induce an increase in TNF expression.TNF-α promotes the development of atherosclerosis [40][41][42].CEBPA has also been studied in relation to aging.Aging exacerbates acute and chronic alcohol-induced liver injury in mice and humans by inhibiting the sirtuin 1-CEBPA-mirNA-223 axis of neutrophils [43].Podocyte CEBPA deficiency can aggravate podocyte senescence and kidney injury in senescent mice [44].
In summary, 33 DEARGs were identified between the control and CAD samples by bioinformatics analysis.
Based on the dataset and clinical sample validation, HSP90AA1 and CEBPA were the final hub genes.The results of this study suggest that HSP90AA1 and CEBPA are closely related to CAD, providing a theoretical basis for the association between aging effectors and CAD.These genes may be promising biomarkers for the diagnosis and treatment of CAD.

Aging-related database and microarray data
In total, 502 ARGs were obtained from the Aging Atlas database (https://ngdc.cncb.ac.cn/).The microarray was downloaded from the NCBI GEO database containing two transcription profiles (GSE12288 and GSE71226.) The GSE12288 dataset was used as a training set and included 110 patients with CAD and 112 healthy individuals.The GSE71226 dataset was used as the validation set.

Analysis of DEARGs
The DEGs from the CAD group and the control group within the GSE12288 dataset were analyzed using the R language software package limma (version 4.2.1).Genes in each sample that met the criteria of P < 0.05 were retained.The results of the differential expression analysis were visualized using the "ggplot2" package to create volcano plots.The DEGs were then compared with 502 ARGs to identify DEARGs.The specific and overlapping components of DEGs and ARGs were analyzed, and the results were visualized as Venn diagrams using the "ggplot2" and "VennDiagram" packages.Furthermore, the "ComplexHeatmap" package was used to create a heatmap representing the expression patterns of DEARGs.

GO and KEGG
GO functional annotation refers to the use of standard expression terms to describe the biological function of genes and proteins in different databases.GO annotations contain three aspects of biological content: Biological Processes, Cellular Components, and Molecular Functions [45].KEGG is a bioinformatics resource on genomes that establishes links from the collection of genes within the genome to the high-level functioning of cells and organisms [46].GO and KEGG enrichment analyses were performed using R software to determine the potential biological function of these DEARGs in CAD.Specifically, the R software "ClusterProfiler" package was used for enrichment analysis after ID conversion of the input DEARGs list, and the "ggplot2" package was used for visualization of the enrichment analysis results.

PPI network and hub genes
The list of proteins was uploaded to an online website called STRING (https://string-db.org/),thereby building the PPI network to analyze the internal connections among DEARGs.Using the Cytohubba plugin in Cytoscape software, 7 algorithms (Closeness, Degree, DMNC, EcCentricity, EPC, MCC, and MNC) were selected to obtain the top 10 genes in each algorithm.The unique and common parts of each group of data were analyzed.The "ggplot" package was used to create an UpSet diagram for visualization of the results.We screened the first six genes as hub genes for further research.

Validation of hub genes and mapping of ROC curve
We used the validation set GSE71226 to verify whether the differential expression of hub genes was consistent with GSE12288, and we screened out genes with the same differential expression.Differences in the expression levels of hub genes between the control group and the CAD group were further verified in the GSE12288 and GSE71226 datasets, and the diagnostic value of hub genes in patients with CAD was tested by ROC curves.

Collection of clinical samples
The clinical study included 50 patients with CAD and 50 control patients treated in Tai'an Central Hospital.The patient's clinical data are shown in Table 4.All participants underwent coronary angiography, and stenosis was assessed by a cardiologist.Samples meeting the following criteria were included in the CAD group: at least one major coronary artery with stenosis of > 50% [47] and patient age of 50-80 years.
The degree of coronary artery stenosis in the control group was < 50%, and the age and sex of the patients in the control group were consistent with those of the patients in the CAD group.Patients with a history of diabetes, cancer, kidney failure, liver disease, or other chronic diseases requiring immunosuppressants, antiinflammatory drugs, or steroids were excluded.The Medical Ethics Committee of Tai'an Central Hospital approved the research procedure, which was performed in accordance with the Declaration of Helsinki (revised in 2013).All patients provided written informed consent prior to the trial.

ELISA and western blot analysis
Three milliliters Ficol-Hypaque (TBD, Tianjin, China) was placed into a 15-ml centrifugation tube.The remaining whole blood samples from the previous step were carefully absorbed and added to the surface of the separation solution.Centrifugation was performed at 550 × g on a horizontal rotor centrifuge for 30 min.The first layer obtained by density gradient centrifugation was blood plasma.The expression levels of HSP90AA1, CEBPA, and FOXO1 in plasma were detected with an ELISA kit (Meimian, Jiangsu, China).Three duplicate holes were set up for each gene in each sample.The second layer obtained by centrifugation comprised PBMCs.This second layer of cells was gently absorbed, suspended with phosphate-buffered saline, and centrifuged at 250 × g for 10 min.If red blood cell precipitation was present, the supernatant was discarded and precipitated with red blood cell lysate (TBD).

Statistical analysis
Statistical analyses were conducted using R version 4.1.2.Student's t-test or the Mann-Whitney U-test was used for statistical analysis between two groups based on the normality of data.Data are presented as mean ± standard deviation.The chi-square test was used for categorical variables.P < 0.05 was considered statistically significant.

Figure 2 .
Figure 2. DEARGs between CAD and control groups.(A) Volcano plot of the DEGs from GSE12288 with P < 0.05 as the threshold value.Red and blue dots indicate significantly upregulated and downregulated genes, respectively.(B) Venn diagram of CAD DEGs and ARGs.(C) Heatmap of the expression of 33 ARGs and CAD-related genes.

Figure 3 .
Figure 3. GO and KEGG enrichment analyses of 33 DEARGs.(A) Bubble plot of enriched GO terms.(B) The circle plot shows the top 12 GO terms.The inner circle represents the z-scores, and the outer circle represents the number of genes in the GO terms.Red indicates upregulated ARGs, and green indicates downregulated ARGs.(C) KEGG pathways of DEARGs.(D) Chord plot of enriched KEGG pathways.Abbreviations: BP: biological process; CC: cellular component; MF: molecular function.

Figure 4 .
Figure 4. PPI network and hub genes.(A) PPI network analysis of DEARGs in the GSE12288 dataset.The circles represent genes, the lines represent PPIs between genes, and the results in the circles represent the protein structures.The colors of the lines represent evidence of PPIs.(B) UpSet map obtained by crossing the hub genes generated by the seven algorithms.

Figure 6 .
Figure 6.Differential expression of hub genes at transcription and translation levels in clinical samples.(A) Relative mRNA levels of HSP90AA1, CEBPA, and FOXO1 by RT-qPCR analysis in whole blood among controls (n = 50) and patients with CAD (n = 50).(B) Plasma expression levels of HSP90AA1, CEBPA, and FOXO1 in controls and patients with CAD.(C) Western blot analyses of HSP90AA1 protein levels in PBMCs, including representative blot images and a densitometric summary of the blot analysis after normalization to GAPDH.(D) Western blot analyses of CEBPA protein levels in PBMCs, including representative blot images and a densitometric summary of the blot analysis after normalization to GAPDH.* P < 0.05, ** P < 0.01.Vertical bars represent standard error.